A Prototype for Authorship Attribution Studies

Authors

  • Patrick Juola
  • John Sofko
  • Patrick Brennan
Abstract

Despite a century of research, statistical and computational methods for authorship attribution are neither reliable, well-regarded, widely used, nor well understood. This paper presents a survey of the current state of the art as well as a framework for uniform and unified development of a tool to apply the state of the art, despite the wide variety of methods and techniques used. The usefulness of the framework is confirmed by the development of a tool using that framework that can be applied to authorship analysis by researchers without a computing specialization. Using this tool, it may be possible both to expand the pool of available researchers and to enhance the quality of the overall solutions (for example, by incorporating improved algorithms as discovered through empirical analysis [Juola, 2004a]).

Similar resources

More than Word Frequencies: Authorship Attribution via Natural Frequency Zoned Word Distribution Analysis

With the increasing popularity and availability of digital text data, the authorship of digital texts cannot be taken for granted due to the ease of copying and parsing. This paper presents a new text style analysis called natural frequency zoned word distribution analysis (NFZ-WDA), and then a basic authorship attribution scheme and an open authorship attribution scheme for digital texts based ...

Explaining Delta, or: How do distance measures for authorship attribution work?

Authorship Attribution is a research area in quantitative text analysis concerned with attributing texts of unknown or disputed authorship to their actual author based on quantitatively measured linguistic evidence (see Juola 2006; Stamatatos 2009; Koppel et al. 2009). Authorship attribution has applications in literary studies, history, forensics and many other fields, e.g. corpus stylistics (...

Questioned Electronic Documents : Empirical Studies in Authorship Attribution

Forensic analysis of questioned electronic documents is very difficult, because the nature of the documents eliminates many kinds of informative differences. Recent work in authorship attribution demonstrates the practicality of analyzing documents based on authorial style, but the state of the art is confusing. Analyses are difficult to apply, little is known about type or rate of errors, and ...

Authorship Attribution Using Word Network Features

In this paper, we explore a set of novel features for authorship attribution of documents. These features are derived from a word network representation of natural language text. As has been noted in previous studies, natural language tends to show complex network structure at word level, with low degrees of separation and scale-free (power law) degree distribution. There has also been work on ...

A Survey on Authorship Analysis

The paper discusses the problem of authorship analysis and its different types, such as authorship attribution, authorship identification, authorship profiling, and plagiarism detection. It also addresses issues specific to Indian-language text. Keywords: authorship attribution, authorship profiling, plagiarism detection, text classification.

Journal:
  • LLC

Volume 21

Publication date: 2006